ABSTRACT


Binary  testing  concerns  finding  good  algorithms  to  solve 
the  class  of  binary  identification  problems.  A  binary 
identification  problem  has  as  input  a  set  of  objects,  including 
one  marked  as  distinguished  (e.g.,  faulty),  for  each  object  an  a 
priori  estimate  that  it  is  the  distinguished  object,  and  a  set  of 
tests.  Output  is  a  testing  procedure  to  isolate  the 
distinguished  object.  One  seeks  minimal  cost  testing  procedures 
where  cost  is  the  average  cost  of  isolation,  summed  over  all 
objects.  This  is  a  problem  schema  for  the  diagnosis  problem: 
applications  occur  in  medicine,  systematic  biology,  machine  fault 
location,  quality  control  and  elsewhere. 

In  this  paper  we  extend  work  of  Garey  and  Graham  to 
assess  the  capability  of  a  fast  approximation  rule,  the  binary 
splitting  rule,  to  give  near  optimal  testing  procedures  when  the 
a  priori  estimates  are  arbitrary.  We  find  conditions  on  the  test 
set  such  that  the  approximation  error  reduces  nearly  to  that  of 
the  equally  likely  a  priori  estimate  case  of  Garey  and  Graham  and 
find  another  upper  bound  on  approximation  error  for  the  same  test 
set  conditions  which  works  very  well  under  a  priori  estimate 
assumptions  where  the  first  result  is  poor. 


Performance  Bounds  for  Binary  Testing 
With  Arbitrary  Weights 


1.  Introduction. 

The binary testing problem is a special case of a general
diagnosis problem where one seeks to find the true culprit (say,
disease) from among n candidates. The general problem has been
studied for many years, using Bayesian statistics, decision
tables, information theory and other methods. (See Payne and
Preece [9], who give a survey of the entire area.) Here we
continue an investigation into the binary testing problem
undertaken by Garey and Graham [5]. (Earlier work on binary
testing was done by many, e.g. Chu [1], Slagle [12],
Garey [2,3,4]; related work has been done by Reinwald and
Soland [11], and, recently, by Moret et al. [8].)

Binary  testing  is  the  task  of  finding  good  algorithms  to 
solve  individual  binary  identification  problems.  A  binary 
identification  problem  consists  of: 

a) a set $O$ of n objects $o_1, \ldots, o_n$, some of which may be
distinguished (e.g. faulty) objects;

b) a corresponding set of n a priori probabilities $p_i$ (also
called object weights), satisfying $p_i > 0$, where $p_i$ is
regarded as the a priori likelihood that $o_i$ is a
distinguished object;

c) a test set $\mathcal{T} = \{T_1, \ldots, T_m\}$ of m distinct tests over $O$,
each test T identified with a different subset S of $O$
such that T responds "yes" if a distinguished object is
in S; otherwise T responds "no".

We henceforth assume that there is precisely one
distinguished object in $O$. Thus we also stipulate that $\sum_i p_i = 1$.
For any test T we write $T(o) = 1$ (or $o \in T$) iff the distinguished
object is in the subset associated with T; otherwise $T(o) = 0$ (or
$o \notin T$). Which object actually is the distinguished object
influences neither the specification of the binary identification
problem nor its solution (see below), because which object is
distinguished is considered unknown and we seek a procedure to
isolate it.

Further  assumptions  we  make  are  that  we  have  an  adequate 
test  set  to  isolate  any  object  as  distinguished,  and  that  all 
tests  have  unit  cost  of  application. 

A  solution  to  a  binary  identification  problem  is  a  binary 
decision  tree  that  is  a  procedure  for  applying  tests  to  determine 
the  distinguished  object.  A  solution  is  called  a  testing 
procedure.  (The  decision  tree  is  also  called  a  solution  tree.) 
A  decision  tree  is  simply  a  tree  graph  of  the  possible  paths  to 
follow  to  isolate  the  distinguished  object;  each  path  is  a 
sequence  of  tests  and  each  arc  from  a  node  is  chosen  by  the 
outcome  of  the  test  that  labels  the  node.  The  test  labeling  the 
root is always the first test applied. Each leaf is labeled by
an object name, which denotes the object isolated by the test
outcomes  on  the  path  to  that  leaf.  (We  often  shall  replace  an 
object  name  by  its  object  weight  at  a  leaf,  whenever  clarity  is 
not  impaired.) 

The  value  of  a  testing  procedure  is  its  expected  cost. 
The expected cost of a testing procedure is

$$\sum_{i=1}^{n} p_i l_i$$

where $l_i$ is the path length to object $o_i$, i.e., the number of
tests executed to isolate $o_i$ as determined by the testing
procedure.
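The definition of expected cost can be checked on small cases with a short Python sketch. It evaluates the weighted sum of path lengths, $\sum_i p_i l_i$, for a procedure described only by its object weights and leaf depths. The function name `expected_cost` and the sample weights are illustrative assumptions and are not taken from Figure 1.

```python
from fractions import Fraction


def expected_cost(weights, path_lengths):
    """Return sum_i p_i * l_i for one testing procedure.

    weights[i] is the a priori weight p_i of object o_i and
    path_lengths[i] is the number of tests l_i on the path to its leaf.
    """
    if len(weights) != len(path_lengths):
        raise ValueError("need exactly one path length per object")
    return sum(p * l for p, l in zip(weights, path_lengths))


# Hypothetical four-object problem (weights chosen only for illustration).
sample_weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

# A long, thin tree isolates the heaviest object after a single test.
thin_tree_cost = expected_cost(sample_weights, [1, 2, 3, 3])

# A balanced tree spends two tests on every object.
balanced_tree_cost = expected_cost(sample_weights, [2, 2, 2, 2])
```

With these sample weights the thin tree is cheaper than the balanced one, since the heavy object sits next to the root.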


Figure  1  presents  a  binary  identification  problem  and  a 
testing  procedure  with  its  expected  cost. 




In  this  paper  we  are  concerned  with  better  understanding 
how  well  a  well-known  algorithm  for  producing  testing  procedures 
does  in  general  circumstances.  The  reason  for  study  in  this  area 
is  the  general  importance  and  wide  applicability  of  the  diagnosis 
problem,  of  which  the  binary  testing  problem  is  an  important 
restriction.  The  reader  is  referred  to  an  excellent  survey  by 
Payne  and  Preece  [9]  where  references  to  applications  in  biology, 
medicine,  machine  fault  location  and  pattern  recognition  are 
given,  along  with  outlines  of  many  approaches  to  finding  good 
testing  procedures.  Understanding  the  binary  testing  problem 
took  a  big  step  forward  with  Garey  [2],  [4]  where  it  was  shown  how 
to obtain testing procedures with minimal expected cost by use of
dynamic programming algorithms. However, these dynamic
programming algorithms have running time exponential in the input
size (usually dominated by the listing of the tests) in the worst
case. Hyafil and Rivest [6] show that the binary testing problem
is NP-hard (also see Loveland [7]), which (many people believe)
implies  that  the  finding  of  optimal  testing  procedures  must  take 
exponential  time  in  the  worst  case.  This  focuses  attention  on 
approximation  algorithms  for  finding  testing  procedures,  which 
attempt  to  obtain  good,  but  not  always  optimal,  testing 
procedures  relatively  quickly.  Garey  and  Graham  [5]  studied  the 
binary  splitting  algorithm  for  finding  testing  procedures  for 
binary  identification  problems  because  this  algorithm  is  the 
essence  of  several  algorithms  offered  by  earlier  investigators. 
It  is  this  study  of  the  performance  bounds  for  the  binary 
splitting  algorithm  (defined  below)  that  we  extend. 

We start with some needed definitions.

If $S \subseteq O$ then let $I(S) = \{ i \mid o_i \in S \}$ and let

$$p(S) = \sum_{i \in I(S)} p_i .$$

Also, for test T let

$$p(T) = \sum_{i \in I(T)} p_i$$

where $I(T) = \{ i \mid o_i \in T \}$.

The binary splitting algorithm is a rule for choosing a next test
to apply at any decision point in the testing procedure. If
$S \subseteq O$ and S contains the distinguished object, choose the test
T that minimizes

$$\left| \frac{p(S \cap T)}{p(S)} - \frac{1}{2} \right| .$$

The  rationale  for  the  binary  splitting  algorithm  may  be  apparent 
to  every  computer  scientist;  it  embodies  the  "divide  (evenly)  and 
conquer"  approach.  For  the  binary  testing  problem  with  unit  cost 
tests  it  maximizes  the  reduction  in  uncertainty.  Thus  it  is  the 
restriction  of  various  entropy-based  splitting  rules. 

The  binary  splitting  algorithm  does  not  always  determine 
a  unique  testing  procedure  because  several  tests  may  meet  the 
selection  criterion  at  a  given  point.  We  will  consider  the  class 
of  all  testing  procedures  meeting  the  binary  splitting  algorithm 
condition. 
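The splitting rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: objects are 0-based indices, each test is a `frozenset` of object indices, and ties between equally good tests are broken by taking the lowest test index, so the sketch returns only one member of the class of binary splitting procedures. The names `split_score`, `binary_splitting_tree` and `tree_cost` are hypothetical.

```python
from fractions import Fraction


def split_score(weights, subset, test):
    """Return |p(S & T)/p(S) - 1/2| for candidate set S and test T."""
    p_s = sum(weights[i] for i in subset)
    p_st = sum(weights[i] for i in subset & test)
    return abs(Fraction(p_st) / Fraction(p_s) - Fraction(1, 2))


def binary_splitting_tree(weights, tests, subset=None):
    """Build one binary splitting decision tree.

    A leaf is an object index; an interior node is a tuple
    (test_index, yes_subtree, no_subtree).
    """
    if subset is None:
        subset = frozenset(range(len(weights)))
    if len(subset) == 1:
        return next(iter(subset))
    # Only tests that actually separate the remaining candidates are useful.
    candidates = [j for j, t in enumerate(tests)
                  if 0 < len(subset & t) < len(subset)]
    if not candidates:
        raise ValueError("test set cannot separate the remaining objects")
    # min() keeps the first minimizer, i.e. the lowest test index on ties.
    best = min(candidates,
               key=lambda j: split_score(weights, subset, tests[j]))
    yes_part = subset & tests[best]
    no_part = subset - tests[best]
    return (best,
            binary_splitting_tree(weights, tests, yes_part),
            binary_splitting_tree(weights, tests, no_part))


def tree_cost(weights, tree, depth=0):
    """Expected cost: each leaf contributes its weight times its depth."""
    if not isinstance(tree, tuple):
        return weights[tree] * depth
    _, yes_branch, no_branch = tree
    return (tree_cost(weights, yes_branch, depth + 1)
            + tree_cost(weights, no_branch, depth + 1))


# Hypothetical problem: three singleton tests plus the test {o_0, o_1}.
example_weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
example_tests = [frozenset({0}), frozenset({1}), frozenset({2}), frozenset({0, 1})]
example_tree = binary_splitting_tree(example_weights, example_tests)
```

A different tie-breaking choice may yield a different tree with a different expected cost, which is the non-uniqueness noted above.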

The  testing  procedure  of  Figure  1  is  a  binary  splitting 
testing  procedure.  So  is  the  testing  procedure  of  Figure  2, 
which  is  for  the  same  binary  identification  problem  and  yields  a 
better  expected  cost.  Thus  we  see  both  that  the  binary  splitting 
algorithm  need  not  specify  a  unique  testing  procedure  and  that 
such  a  procedure  need  not  be  optimal. 
In Section 2 certain known results are reviewed including
the  results  of  Garey  and  Graham  which  our  results  extend. 
Section  3  gives  an  example  due  to  Garey  and  Graham  that  shows  how 
bad  the  splitting  algorithm  can  be  for  arbitrary  weights.  Our 
results  follow  in  Sections  4  and  5. 

2.  Some  Known  Results. 

It is not possible to give all the known results that
relate  to  binary  testing;  a  fuller  summary  appears  in  Payne  and 
Preece  [9] . 

A test set $\mathcal{T}$ is complete iff (if and only if) for any
set $S \subseteq O$ (the set of objects) there is a test $T \in \mathcal{T}$ such that $T = S$
or $O - T = S$. If a complete test set is given, there is an algorithm
(essentially  the  Huffman  code  algorithm)  that  determines  the 
optimal  testing  procedure  for  arbitrary  object  weights.  The 
algorithm  is  linear  in  the  input  string  length  if  the  input 
object  weights  are  ordered  so  the  task  of  quickly  finding  minimum 
expected  cost  testing  procedures  is  solved  in  this  case  for  the 
restricted  problem  we  consider.  (When  tests  have  different  costs 
the  computation  becomes  much  more  complex,  but  this  problem  has 
been tackled; see Picard [10].)
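For the complete test set case, the Huffman-style construction can be sketched as below. The sketch only computes the optimal expected cost, using the standard fact that the cost of a Huffman tree equals the sum of the weights of all merged (interior) nodes. It uses a heap and therefore runs in O(n log n) time; it is not the linear-time variant for presorted weights mentioned above. The function name `huffman_expected_cost` is an illustrative assumption.

```python
import heapq
from fractions import Fraction


def huffman_expected_cost(weights):
    """Expected cost of a Huffman tree built over the object weights.

    Repeatedly merges the two lightest subtrees. Each merge creates an
    interior node (one test), and every leaf below it pays for that test,
    so the merged weight is added to the running total.
    """
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        lightest = heapq.heappop(heap)
        second = heapq.heappop(heap)
        merged = lightest + second
        total += merged
        heapq.heappush(heap, merged)
    return total


# Hypothetical weights, chosen only for illustration.
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
uniform = [Fraction(1, 4)] * 4
skewed_cost = huffman_expected_cost(skewed)
uniform_cost = huffman_expected_cost(uniform)
```

Because every subset (or its complement) is available as a test when the test set is complete, each merge in the sketch corresponds to a test that can actually be applied.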

The works of Garey [4], Garey and Graham [5], Hyafil and
Rivest [6], etc., discussed earlier concern the incomplete test
set problem. It is here that the binary splitting algorithm is
used. The importance of the work of Garey and Graham [5] is that


they  determined  a  strong  bound  on  how  poorly  the  binary  splitting 
testing  procedures  could  perform  relative  to  the  optimal  testing 
procedure  when  the  object  weights  are  equal.  Intuition  and 
experience  with  specific  problems  led  most  knowledgeable  people 
to  believe  that  the  splitting  algorithm  would  always  produce 
good,  if  not  perfect,  results,  especially  for  the  equal  object 
weight  case.  We  shall  state  the  results  of  Garey  and  Graham  and, 
in  the  following  sections,  pursue  the  same  question  when  the 
object  weights  are  arbitrary.  There  the  results  perhaps  are  as 
surprising  as  the  Garey  and  Graham  results  for  the  equal  object 
weight  problems. 

Given a binary identification problem with (an
incomplete) test set $\mathcal{T}$ of unit cost tests, let $K_{opt}$ denote the
expected cost of an optimal testing procedure, let $K$ denote the
expected  cost  of  the  lowest  cost  binary  splitting  procedure  and 
let  K'  denote  the  expected  cost  of  the  worst  (highest)  cost 
binary  splitting  procedure. 

The following results are proven in Garey and Graham [5].
We write $\log n$ for $\log_2 n$.

Theorem  1  (Garey  and  Graham) .  There  exists  a  binary 
identification  problem  of  n  objects  with  equal  object  weights 


such  that 
Theorem 2 (Garey and Graham). For any binary identification
problem of n objects with equal object weights, if at most
$c \log n$ tests are required to identify any object in the optimal
testing procedure, then

$$\frac{K'}{K_{opt}} \le \frac{2c \log n}{1 + \log c + \log\log n} + \cdots$$

A  lemma  used  by  Garey  and  Graham  to  prove  Theorem  2  is 
stated here also because we will have cause to refer to it. We
use $|S|$ to denote the cardinality of set S.


Lemma (Garey and Graham). For a binary identification problem
with n objects of equal object weights, if for some r,
$0 < r < 1/2$, test set $\mathcal{T}$ satisfies the following condition:

for all $S \subseteq O$ such that $|S| \ge 2$, there exists $T \in \mathcal{T}$ such that

$$r|S| \le |S \cap T| \le (1-r)|S|,$$

then

$$K' \le \frac{\log n}{r \log(1/r)} + \frac{1-r}{r} .$$


The special case of simple binary identification problems
warrants mention because of the type of test employed. A
singleton test responds positively to precisely one element; in
set notation, $T = \{o_i\}$. We consider test sets employing
singleton tests later in this paper. A simple identification
problem is a binary identification problem with every test a
singleton test. Because the test set must be adequate, an
n-object simple identification problem must have at least n-1
singleton tests. For this class of binary identification
problems,  a  fast  algorithm  is  known  for  producing  optimal  test 
procedures  for  arbitrary  cost  test  sets  (see  Garey  [3]).  For  a 
related  problem  where  at  most  one  distinguished  object  exists, 
see  Chu  [1] . 


3.  The  General  Case. 

Garey and Graham (private communication) have discovered
that  the  binary  splitting  algorithm  can  perform  very  badly 
relative  to  the  optimal  case  under  arbitrary  object  weights  in 
the  incomplete  test  set  situation.  The  following  example  shows 
that  there  is  a  family  of  binary  test  problems  such  that 


for  the  n  object  member  of  the  family.  Since  all  reasonable 
testing  procedures  have  expected  cost  no  greater  than  n-1,  this 
result  is  about  as  bad  as  could  be  expected.  (A  reasonable 
testing  procedure  would  eliminate  at  least  one  object  from 
consideration  with  each  test  so  no  path  would  have  more  than  n-1 
tests.) 


Lower Bound on Worst Case (Garey and Graham). For pedagogical
purposes it is convenient to let n be even, i.e. $n = 2m+2$,
and to utilize a parameter $\epsilon$ regarded as a small positive
number.

Object          Object weight

$o_0$           $1 - \epsilon$
$o_{2k-1}$      $2^{-(k+1)} \epsilon$
$o_{2k}$        $2^{-(k+1)} \epsilon$
$o_{2m+1}$      $2^{-(m+1)} \epsilon$

Tests

$T_0 = O$ (universe)
$T_1 = \{o_{2k-1}\}$
$T_2 = \{o_{2k}\}$
$T_3 = \{o_1, o_2\}$
$T_i = \{o_{2i-5}, o_{2i-4}\}$, $3 \le i \le m+2$
We first find an upper bound for $K_{opt}$. First, consider a
path to isolate $o_0$. Assuming $o_0$ is the distinguished object we
see that $T_1$ then $T_2$ then $T_{m+3}$ eliminates all other candidates and
so defines an isolating path. Every other object can be isolated
within m+1 tests after $T_1$ or $T_2$ are applied. Thus

(*)  $K_{opt} \le 3(1-\epsilon) + \epsilon(m+3) < 4$  for $\epsilon < (m+3)^{-1}$.

We now consider a lower bound for $K'$. Every test except
$T_0$ has weight less than 1/2 for sufficiently small $\epsilon$; indeed,
$p(\bigcup_{i \neq 0} T_i) < 1/2$. Thus the binary splitting algorithm will first
choose the test other than $T_0$ with the largest weight. Test $T_3$
has weight $2^{-1} \epsilon$ whereas tests $T_1$ and $T_2$ have weight
$(2^{-1} - 2^{-(m+1)}) \epsilon$, as is most easily seen by observing that
$T_1 \cup T_2 \cup T_{m+3}$ has weight $\epsilon$, that $T_1$, $T_2$ and $T_{m+3}$ are disjoint and
that $T_1$ and $T_2$ are symmetrical. Let us suppose the test responds
negatively. Then, in like manner, over the set $O - \{o_1, o_2\}$, $T_4$
has most weight. Suppose this test result is also negative.
Then we continue in this manner. Thus the binary splitting
algorithm selects, in order, tests $T_3, T_4, T_5, \ldots, T_{m+2}, T_{m+3}$ in
order to isolate $o_0$. A positive response anywhere along this
path would lead to isolating other objects with path length at
least one. Thus

(**)  $K' > m(1-\epsilon) + \epsilon$

$\ge \frac{n-2}{2}(1-\epsilon) + \epsilon$

$> \frac{n}{4}$  for $\epsilon < 1/4$, $n > 8$.
Putting  (*)  and  (**)  together,  we  have 


for $\epsilon < 1/n$, $n > 8$.


4.  Singleton  Tests. 

In the last section we saw that for arbitrary object
weights the relative performance of binary splitting procedures
to optimal procedures can be about as bad as can be contemplated.
In Section 2 we saw that the same ratio $K'/K_{opt}$ for equal object
weight problems was poorer than expected but not nearly so bad.
In this section we find an interesting restriction of the
arbitrary object weight problem set where the ratio $K'/K_{opt}$ has
nearly as good a bound as the equal weight case. The restriction
is that the test set include all singleton tests.

A test set $\mathcal{T}$ is singleton complete if all singleton
tests are present in $\mathcal{T}$. We shall label the test $\{o_i\}$ by $T_{s_i}$.

We recall that here all tests have unit cost and that
$\log x$ denotes $\log_2 x$.

We  state  the  first  result. 
Theorem 3. Given a binary identification problem with n
objects, $n > 2$, weights $p_1, \ldots, p_n$ and a singleton complete test
set $\mathcal{T}$, such that some test procedure requires at most $c \log n$
tests to identify any object, then

$$\frac{K'}{K_{opt}} \le \frac{2c \log^2 n}{1 + \log c + \log\log n} + 2c \log n .$$


The upper bound here is seen to differ from the Garey and
Graham result by the multiple $\log n$. Curiously, this comes not
from the bound on $K'$ but from the lower bound on $K_{opt}$. A minor
(but important) distinction also is that we do not want to
require  the  optimal  test  procedure  to  identify  any  object  in 
c  log  n  tests  but  only  require  some  test  procedure  to  have  this 
property.  When  arbitrary  weights  are  involved  a  non-optimal 
procedure  may  have  this  property  when  an  optimal  procedure  does 
not. 

To  obtain  this  result  we  use  a  modification  of  the  lemma 
of  Garey  and  Graham  stated  at  the  end  of  Section  2. 

Lemma. Given a binary identification problem with n objects,
if there exists an r, $0 < r < 1/2$, such that for each $S \subseteq O$ with
$|S| \ge 2$ either

(a) There exists a $T_{s_i} \in \mathcal{T}$ such that

$$\frac{p(S \cap T_{s_i})}{p(S)} > \frac{1}{2}$$

or

(b) There exists a $T \in \mathcal{T}$ such that

$$r \cdot p(S) \le p(S \cap T) \le (1-r) \cdot p(S)$$

then

$$K' \le \frac{\log n}{r \log(1/r)} + \frac{1-r}{r} .$$
Proof of Lemma.

The proof is by induction on n. The result is seen by
inspection to hold for $n=1$, $n=2$, and $n=3$. (Note for $n=3$ that
$K' \le 2$.) We show that the lemma holds for each $n_0 \ge 4$. To begin
the induction step proof we may assume that the lemma holds for
all $n < n_0$. Assume the hypotheses of the lemma hold for $n_0$ and
that the binary splitting algorithm generates a test procedure.
The first test splits $O$ into $S$ and $\bar{S}$ of weight $p(S)$ and $p(\bar{S})$
respectively. $K_S$ ($K_{\bar{S}}$) denotes the expected number of tests
required for $S$ ($\bar{S}$) by the algorithm.

We suppose that $S$ and $\bar{S}$ are determined by condition (a)
of the lemma and that $T_{s_i}$ is the singleton test such that
$S = O \cap T_{s_i}$ and $p(S) > 1/2$. Here $K_S = 0$ because $|S| = 1$.

Therefore,

$$K \le (1 - p(S))(K_{\bar{S}} + 1) + p(S) = (1 - p(S)) K_{\bar{S}} + 1 .$$
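By the induction hypothesis

$$K \le (1 - p(S)) \left( \frac{\log(n-1)}{r \log(1/r)} + \frac{1-r}{r} \right) + 1
\le \frac{1}{2} \cdot \frac{\log(n-1)}{r \log(1/r)} + \frac{1}{2} \cdot \frac{1-r}{r} + 1 .$$

For $r \le 1/3$, we note that $\frac{1-r}{r} \ge 2$, so

$$K \le \frac{1}{2} \cdot \frac{\log(n-1)}{r \log(1/r)} + \frac{1-r}{r}
\le \frac{\log n}{r \log(1/r)} + \frac{1-r}{r} .$$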


For $1/3 < r < 1/2$, we show that the result holds for
$n \ge 4$. For $n \ge 4$,

$$\log(n-1) \ge \log 3 \ge \log(1/r) \quad \text{for } 1/3 < r,$$

$$\ge 2r \log(1/r) .$$

Therefore,

$$\frac{\log(n-1)}{2r \log(1/r)} \ge 1 .$$

Using the induction hypothesis,

$$K \le \frac{1}{2} \left( \frac{\log(n-1)}{r \log(1/r)} + \frac{1-r}{r} \right) + 1
\le \frac{1}{r} \cdot \frac{\log(n-1)}{\log(1/r)} + \frac{1-r}{r} .$$


Since the argument is valid for any test procedure generated by
the binary splitting algorithm, the result holds with $K'$
replacing $K$.

The case that condition (b) determines $S$ and $\bar{S}$ follows
exactly the argument in Garey and Graham [5]. □


The proof of Theorem 3 is a variant on the proof of
Theorem 2 of Garey and Graham.

Proof  of  Theorem  3. 

Consider a binary identification problem satisfying the
theorem hypothesis. We show that the Lemma holds with

$$r = \frac{1}{2c \log n} .$$

For convenience, let $A = c \log n$.

Let $S$ be any subset of $O$ with $|S| \ge 2$. We show that if
condition (b) of the Lemma does not hold then condition (a) must
hold.

Suppose condition (b) fails for $r = \frac{1}{2A}$. That is,

$$p(S \cap T) < \frac{1}{2A} \cdot p(S), \quad \text{or}$$

$$p(S \cap T) > \left( 1 - \frac{1}{2A} \right) p(S), \quad \text{all } T \in \mathcal{T} .$$

If at most $\lfloor A \rfloor$ (the integral part of A) tests are then applied in
any order, and all get appropriate responses, we can have a set
$S_A$ remaining such that

$$p(S_A) > p(S)/2 .$$

$S_A$ must be a singleton set for otherwise the hypothesis is
violated that every object is identifiable within A tests by some
testing procedure. But then there exists a singleton test $T_{s_i}$
such that

$$p(S \cap T_{s_i}) > p(S)/2$$

which  is  condition  (a)  of  the  Lemma.  Thus  the  Lemma  is 
applicable  under  the  theorem  hypotheses. 

Applying the Lemma with $r = 1/(2c \log n)$ we get

$$K' \le \frac{2c \log^2 n}{1 + \log c + \log\log n} + 2c \log n - 1 .$$

Since we know that $K_{opt} \ge 1$ we have our result. □
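The substitution in this last step can be checked numerically. The Python sketch below evaluates the Lemma bound $\frac{\log n}{r \log(1/r)} + \frac{1-r}{r}$ at $r = 1/(2c \log n)$ and compares it with the Theorem 3 expression $\frac{2c \log^2 n}{1 + \log c + \log\log n} + 2c \log n$; the two should differ by exactly 1, the term that is dropped when passing to the ratio. The function names and the sample values of n and c are illustrative assumptions.

```python
import math


def lemma_bound(n, r):
    """Lemma bound on K': log n / (r log(1/r)) + (1 - r)/r, logs base 2."""
    return math.log2(n) / (r * math.log2(1 / r)) + (1 - r) / r


def theorem3_bound(n, c):
    """Theorem 3 bound on K'/K_opt (requires n > 1 and c > 0)."""
    log_n = math.log2(n)
    denominator = 1 + math.log2(c) + math.log2(log_n)
    return 2 * c * log_n ** 2 / denominator + 2 * c * log_n


def r_for_theorem3(n, c):
    """The value r = 1/(2c log n) used in the proof of Theorem 3."""
    return 1 / (2 * c * math.log2(n))


# Sample values, chosen only for illustration.
n_sample, c_sample = 1024, 1.5
gap = theorem3_bound(n_sample, c_sample) - lemma_bound(
    n_sample, r_for_theorem3(n_sample, c_sample))
# gap is 1 up to floating-point rounding
```

The check relies on $\log(1/r) = \log(2c \log n) = 1 + \log c + \log\log n$ and $(1-r)/r = 2c \log n - 1$.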

In the proof of Theorem 3 we made a careful analysis of
$K'$ but used the immediately obvious lower bound of 1 for $K_{opt}$.
We see by example that there is a class of binary identification
problems that satisfy the conditions of Theorem 3 for which
$K_{opt} < 2$ regardless of the number of objects in $O$. Thus the
lower bound cannot be improved. (It is clear that for this class
the ratio $K'/K_{opt}$ is not close to the bound of Theorem 3. We
consider this after stating the example.)

Example. The binary identification problems given here satisfy
the hypotheses of Theorem 3 with $c = 1 + \epsilon(n)$, where
$\lim_{n \to \infty} \epsilon(n) = 0$, and have $K_{opt} < 2$.

Consider $O = \{o_1, \ldots, o_n\}$ with $p_i = 2^{-i}$, $1 \le i \le n-1$, and
$p_n = 2^{-(n-1)}$. All possible tests exist.

The potentially complete binary tree (where the minimum
and maximum path lengths to leaves differ by at most one) is a
possible, but non-optimal, testing procedure where every object
is identifiable using $\lceil \log n \rceil$ tests. (Here $\lceil m \rceil$ is the least
integer not less than m.) The expected cost here is $O(\log n)$.

In  Figure  3  we  present  an  alternate  procedure  with  much 
lower  (indeed  optimal)  expected  cost.  This  is  the  testing 
procedure that is found by the binary splitting algorithm. The
tree  is  long  and  thin  (we  will  call  it  a  vine;  see  next  section) 
but  this  allows  the  larger  weights  to  get  closer  to  the  root 
which  can  often  result  in  the  lowest  expected  cost.  For  this 
testing  procedure  we  have 

$$K \le \left( \sum_{i=1}^{n-1} i \, 2^{-i} \right) + (n-1) 2^{-(n-1)} < 2 .$$

Thus $K_{opt} < 2$, uniformly in n.
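The cost of this long, thin procedure can be verified with exact arithmetic. The Python sketch below builds the example weights $p_i = 2^{-i}$ for $1 \le i \le n-1$ and $p_n = 2^{-(n-1)}$, places object $o_i$ at depth i (with the last two objects sharing depth n-1), and sums weight times depth. The function names are illustrative assumptions; the closed form $2 - 2^{-(n-2)}$ quoted in the comment follows from summing the series and is not stated in the text.

```python
from fractions import Fraction


def example_weights(n):
    """Weights of the example: 2^-1, ..., 2^-(n-1), and 2^-(n-1) again."""
    weights = [Fraction(1, 2 ** i) for i in range(1, n)]
    weights.append(Fraction(1, 2 ** (n - 1)))
    return weights


def vine_cost(weights):
    """Expected cost when objects are peeled off one per test, in order.

    The k-th object (1-based) sits at depth k, except that the last two
    objects are separated by the same final test and share depth n - 1.
    """
    n = len(weights)
    depths = list(range(1, n)) + [n - 1]
    return sum(p * d for p, d in zip(weights, depths))


# For every n >= 2 the cost equals 2 - 2^-(n-2), which stays below 2.
costs = {n: vine_cost(example_weights(n)) for n in range(2, 12)}
```

The costs increase toward 2 as n grows but never reach it, which matches the uniform bound on $K_{opt}$.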

5.  Vine  Testing  Procedures. 


The  binary  identification  problem  considered  at  the  end 
of  the  last  section  points  up  a  weakness  of  Theorem  3.  There  are 
binary  identification  problems  where  the  optimal  testing 
procedure  has  a  long  and  "thin"  decision  tree  which  leads  to  a 
very  small  expected  cost.  The  binary  splitting  algorithm  often 
finds  such  a  testing  procedure  which  means  that  the  bound  given 
by Theorem 3 grossly overestimates the ratio $K'/K_{opt}$ (if one is
even  fortunate  enough  to  satisfy  the  "reachability"  hypothesis) . 
In  this  section  we  prove  a  theorem  that  gives  another  upper  bound 
on  the  expected  cost  for  the  binary  splitting  testing  procedures. 
This  bound  is  particularly  useful  in  those  cases  where  Theorem  3 
is least useful, namely, when the optimal testing procedure has a
long  and  thin  decision  tree. 


The  theorem  also  leads  one  to  conjecture  that  Theorem  3 
is  not  a  strong  upper  bound  because  the  binary  identification 


problems where $K_{opt} <$ constant regardless of problem size are
seen to have upper bounds on $K'$ well below that given by Theorem
3. (We conjecture that the upper bound for $K'/K_{opt}$ for binary
identification problems with arbitrary object weights and
singleton complete test sets is the same as the equal object
weight  case  established  by  Garey  and  Graham.) 

The  theorem  also  has  some  intrinsic  interest  as  a  theorem 
on  weighted  binary  trees. 

The  testing  procedures  we  study  here  are  vine  procedures. 
A  vine  is  a  binary  tree  where  all  interior  nodes  lie  on  one 
branch.  (We  observe  that  for  decision  trees  all  interior  nodes 
have  two  sons.)  The  tree  of  Figure  3  is  a  vine.  For  a  vine  each 
interior  node  is  adjacent  to  (at  least)  one  leaf  node.  A  vine 
procedure  is  a  testing  procedure  whose  decision  tree  is  a  vine. 
An optimal vine procedure is a vine procedure such that if
$w_i$ and $w_j$ are weights that label leaves, and $w_i > w_j$, then the $w_i$
leaf is closer to the root than is the $w_j$ leaf.

A  binary  identification  problem  with  a  singleton  complete 
test  set  always  has  an  optimal  vine  testing  procedure,  although 
many  times  the  procedure  may  have  relatively  high  expected  cost. 
However,  we  have  noted  that  when  the  object  weights  are  quite 
skewed  the  optimal  vine  procedure  can  be  of  very  low  expected 
cost.  It  would  be  nice  to  relate  the  vine  procedure  to  the 
binary splitting procedures, especially as follows: the optimal
vine procedure expected cost ($K_V$) and any binary splitting
procedure expected cost ($K_{BS}$) satisfy $K_{BS} \le K_V$. Unfortunately,
this  does  not  always  hold  as  Figure  4  shows  us.  However,  this  is 
the  nature  of  the  result  we  seek  since  this  would  give  a  good 
bound  on  binary  splitting  procedures  when  the  best  testing 
procedures  have  long  and  thin  decision  trees. 

Although we cannot realize $K_{BS} \le K_V$ we are able to show
that  matters  do  not  get  worse  than  is  suggested  by  the  example  of 
Figure  4. 

Theorem 4. For any binary identification problem where the test
set $\mathcal{T}$ contains all singleton tests and for any binary splitting
testing procedure for this problem we have

$$K_{BS} \le K_V + 1$$

where $K_{BS}$ is the expected cost for the binary splitting procedure
and $K_V$ is the expected cost for the optimal vine procedure.

In particular, for any binary identification problem with
a singleton complete test set we have $K' \le K_V + 1$, where $K'$ is
the worst case expected cost for binary splitting procedures.
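The inequality $K_{BS} \le K_V + 1$ can be spot-checked on small instances. The Python sketch below computes the optimal vine cost by sorting the weights in decreasing order and computes the cost of one binary splitting procedure over a singleton complete test set. It is a check on hypothetical examples, not a proof; the function names, the sample weights, the extra test {o_0, o_1} and the tie-breaking rule (first minimizing test in list order) are all assumptions made for illustration.

```python
from fractions import Fraction


def optimal_vine_cost(weights):
    """K_V: heaviest leaf at depth 1, next at depth 2, and so on.

    The two lightest leaves hang from the same final test and share
    depth n - 1.
    """
    ordered = sorted(weights, reverse=True)
    n = len(ordered)
    if n == 1:
        return 0 * ordered[0]
    depths = list(range(1, n)) + [n - 1]
    return sum(p * d for p, d in zip(ordered, depths))


def bs_cost(weights, tests, subset):
    """Expected cost of one binary splitting procedure on `subset`.

    Uses the identity: expected cost = sum, over interior nodes, of the
    total weight of the leaves below that node.
    """
    if len(subset) == 1:
        return 0
    p_s = sum(weights[i] for i in subset)
    splitting = [t for t in tests if 0 < len(subset & t) < len(subset)]
    best = min(
        splitting,
        key=lambda t: abs(sum(weights[i] for i in subset & t) / p_s
                          - Fraction(1, 2)))
    return (p_s
            + bs_cost(weights, tests, subset & best)
            + bs_cost(weights, tests, subset - best))


# Hypothetical instance: four equally likely objects, all singleton
# tests, plus one extra test that halves the object set.
equal_weights = [Fraction(1, 4)] * 4
singleton_tests = [frozenset({i}) for i in range(4)]
all_tests = singleton_tests + [frozenset({0, 1})]
all_objects = frozenset(range(4))

k_bs = bs_cost(equal_weights, all_tests, all_objects)
k_v = optimal_vine_cost(equal_weights)
```

In this instance the binary splitting procedure is cheaper than the optimal vine, so the bound holds with room to spare; instances where binary splitting is worse than the vine require more skewed weights.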

Proof.  The  proof  is  presented  entirely  in  terms  of  weighted 
binary  trees  except  for  one  key  property  of  BS  trees  we  prove 


below. Because every testing procedure is represented by a
decision tree, we may assume we are given a BS tree $T_{BS}$ and we
will show that the theorem statement holds by producing an
(optimal) vine tree $T_V$ such that $K_{BS} \le K_V + 1$.

Before  we  can  state  the  key  property  pertaining  to  BS 
trees  we  require  some  definitions.  For  any  weighted  binary  tree 
(such  that  each  interior  node  has  two  subtrees)  one  can  choose  a 
leaf  labeled  by  w  and  consider  the  path  of  (zero  or  more) 
interior  nodes  between  the  leaf  and  the  root  of  the  binary  tree 
(the  w-path) .  Each  interior  node  on  the  w-path  has  another 
subtree  attached  to  the  node,  a  secondary  subtree  of  the  w-path. 
(In  Figure  4,  the  .49-path  on  the  BS  procedure  tree  has  two 
secondary  subtrees  with  one  and  seven  leaves  respectively.)  We 
define  the  leaf  weight  of  a  weighted  binary  tree  to  be  the  sum  of 
the  weights  of  leaves  of  the  tree. 

We  prove  the  following  fact  regarding  BS  trees;  this  is 
the  only  property  specific  to  BS  trees  that  we  need. 

Fact. For any BS tree $T_{BS}$ and any weight w labeling a
leaf of $T_{BS}$, every secondary subtree of the w-path, except
possibly  the  subtree  closest  to  the  leaf,  has  leaf  weight  at 
least  as  large  as  weight  w. 

We prove the Fact by assuming it false and deriving a
contradiction. Let $\alpha$ denote a node on the w-path where the
secondary tree has leaf weight s, $s < w$, and $\alpha$ is not adjacent to
the leaf (labeled) w. Node $\alpha$ is the root of a tree; let the leaf
weight of this tree be t. t is the weight that is "split up" at
node $\alpha$. We must have $w < t/2$ or else the optimal split is $(w, t-w)$
and the node w is one subtree, violating our supposition
that $\alpha$ is not adjacent to node w. But if $t/2 > w > s$, then
$(w, t-w)$ is closer to $(1/2, 1/2)$ than is $(s, t-s)$ and would be
favored by the binary splitting algorithm. But then one subtree
to $\alpha$ again would be node w, making nodes w and $\alpha$ adjacent, which
violates our supposition. Thus $s < w$ is impossible and the Fact is
proven.

The proof of the theorem proceeds by building the given
tree $T_{BS}$ and also the corresponding $T_V$ in stages. We define
trees $T_0, T_{BS1}, T_{BS2}, \ldots, T_{BSn} = T_{BS}$ and $T_0, T_{V1}, \ldots, T_{Vn} = T_V$ where
$T_0$ is a single node tree with weight 1, and $T_{BS}$ and $T_V$ are the
trees of the binary splitting procedure and the optimal vine
procedure respectively. The proof is by induction on the number
of stages.

The proof is better understood if we prove a restricted
case first. We shall assume that for the given $T_{BS}$ which we must
reconstruct there is no w-path with a secondary subtree
whose leaf weight is less than w. We shall see that in this
case $K_{BSi} \le K_{Vi}$ for all i, $1 \le i \le n$, and $K_{BS} \le K_V$.

To define $T_{BS1}$ and $T_{V1}$ we begin with $T_0$. Let $w_1$ be the
largest weight in $T_{BS}$ and find the $w_1$-path in $T_{BS}$.
Consider the vine defined by this $w_1$-path where the secondary
subtrees in $T_{BS}$ are replaced by single node subtrees with weight
equal to the leaf weight of the secondary subtree it replaces. We call
the single node subtrees secondary nodes for the w-path. (For the BS
tree of Figure 4 the secondary nodes for the .49-path have
weights .01 and .5 assigned respectively. See Figure 5.) The
vine just defined is $T_{BS1}$ and $T_{V1}$. Let
$b_{11}, \ldots, b_{1r_1}$ (where $r_1 \ge 1$) be the secondary nodes
created in the vine defined above, enumerating from the leaf $w_1$.
Thus here $w_1, b_{11}, \ldots, b_{1r_1}$ are all the leaf weights
(i.e., labels) for $T_{BS1}$ and $T_{V1}$.
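
The extraction of secondary nodes along a w-path can be sketched as
follows. This is an illustration under assumptions that are not in the
paper: a tree is encoded as a nested pair `(left, right)` with a bare
number for a leaf, and the example tree is hypothetical (it is chosen
only so that the .49-path yields secondary weights .01 and .5; it is
not claimed to be the tree of Figure 4).

```python
def leaf_weight(tree):
    """Total weight of the leaves of a tree (a number, or a (left, right) pair)."""
    if isinstance(tree, tuple):
        return leaf_weight(tree[0]) + leaf_weight(tree[1])
    return tree


def secondary_nodes(tree, w):
    """Leaf weights of the secondary subtrees on the w-path, nearest to leaf w first.

    Returns None if no leaf of `tree` carries weight w.
    """
    if not isinstance(tree, tuple):
        return [] if tree == w else None
    left, right = tree
    for on_path, sibling in ((left, right), (right, left)):
        below = secondary_nodes(on_path, w)
        if below is not None:
            # The sibling hangs off the path; record only its total leaf weight.
            return below + [leaf_weight(sibling)]
    return None


# Hypothetical tree: the .49 leaf has sibling .01, and a .5 subtree hangs at the root.
example = ((0.49, 0.01), (0.25, 0.25))
nodes = secondary_nodes(example, 0.49)
```

The returned list enumerates the secondary nodes from the leaf toward
the root, matching the order $b_{11}, b_{12}, \ldots$ used in the text.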


We now construct $T_{BS2}$ and $T_{V2}$. To construct $T_{BS2}$ from
$T_{BS1}$, let $w_2$ be the next largest weight in $T_{BS}$ after
$w_1$. Find the $w_2$-path in $T_{BS}$. If $w_2 = w_1$ then
$w_2 = b_{1k}$, some k, is possible and then $T_{BS2} = T_{BS1}$.
Otherwise, the first portion of the $w_2$-path from the leaf is
contained in a secondary subtree of the $w_1$-path of $T_{BS}$. Let
$b_{1k}$ label the secondary node in $T_{BS1}$ associated with the
secondary subtree containing part of the $w_2$-path. We form
$T_{BS2}$ from $T_{BS1}$ by replacing node $b_{1k}$ by the vine that
completes the $w_2$-path in $T_{BS2}$. This vine has secondary nodes
$b_{21}, \ldots, b_{2r_2}$ to replace secondary subtrees in $T_{BS}$
along the $w_2$-path where it is distinct from the $w_1$-path. (See
Figure 5 for an illustration of $T_{BS1}$, $T_{BS2}$, and $T_{V2}$ for
the BS tree of Figure 4.)


To create $T_{V2}$ from $T_{V1}$ we expand $b_{1k}$ in the identical
way except we must first move (the label) $b_{1k}$ to the node
(labeled) $w_1$ so that the expansion of node $b_{1k}$ to a vine
results in another vine. We interchange (labels) $w_1$ and $b_{1k}$ to
achieve this move. By the Fact (and our simplification assumption)
$b_{1k} \ge w_1$, so the expected cost K cannot be decreased by this
move, since there is a net weight change farther away from the root
("down the tree") if any change occurs. Now replace $b_{1k}$ by the
same vine replacing $b_{1k}$ in the creation of $T_{BS2}$. This
defines $T_{V2}$. We see that $K_{BS2} \le K_{V2}$. (Recall that
$K_{BS1} \le K_{V1}$ since $T_{BS1} = T_{V1}$.)
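
The effect of interchanging two labels on a vine can be illustrated
with a small computation. This is a sketch under assumptions of our
own: a vine is encoded as a list of leaf weights read from the root
downward, with the last two leaves taken to be siblings, and the
weights are illustrative rather than taken from any figure.

```python
def vine_cost(weights):
    """Expected cost of a vine whose leaves, listed from the root down, carry `weights`.

    Leaf k (0-based) hangs at depth k + 1, except that the last two leaves
    are siblings and share depth len(weights) - 1.
    """
    m = len(weights)
    depths = [min(k + 1, m - 1) for k in range(m)]
    return sum(w * d for w, d in zip(weights, depths))


# Assumed vine: secondary node .5 at depth 1, then siblings .01 and w_1 = .49.
before = [0.5, 0.01, 0.49]
# Interchange the labels w_1 = .49 and b_1k = .5 so that .5 sits at the bottom.
after = [0.49, 0.01, 0.5]

cost_before = vine_cost(before)
cost_after = vine_cost(after)
```

Because the heavier label moves away from the root, the cost rises
here by (.5 - .49) times the depth difference of one, so it is not
decreased.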

The general outline should be clear. $T_{BS}$ is being
"reconstructed" by expanding secondary nodes so that the subtrees
are gradually rebuilt. The tree $T_V$ is gradually built by
appending all the vines used in intermediate construction of $T_{BS}$
at the "end" of a previous vine, to preserve "vinehood". The
general form for the restricted case can now be presented.


To construct $T_{BSi}$ from $T_{BS(i-1)}$, locate the i-th largest
weight $w_i$ on $T_{BS}$. If $w_i$ already labels a secondary node
$b_{jk}$ in $T_{BS(i-1)}$ so that the $w_i$-path of $T_{BS}$ is
present in $T_{BS(i-1)}$, then $T_{BSi} = T_{BS(i-1)}$ and
$T_{Vi} = T_{V(i-1)}$. Otherwise, the first portion of the
$w_i$-path must replace the node $b_{jk}$ where the $w_i$-path joins
the portion already present in $T_{BS(i-1)}$. Secondary nodes with
appropriate weights represent the secondary subtrees of the
$w_i$-path not yet expanded. This defines $T_{BSi}$. To define
$T_{Vi}$, label $b_{jk}$ is moved to replace $w_{i-1}$ at one of the
farthest leaves of $T_{V(i-1)}$. However, rather than placing
$w_{i-1}$ in the $b_{jk}$ location, we bump up the weights. That is,
the closest weight $w_a$ labeling a node farther from the root than
(i.e., "below") $b_{jk}$ is moved to the $b_{jk}$ location. The $w_a$
location now vacant is filled by the closest weight below the $w_a$
location. This continues and finally weight $w_{i-1}$ moves to the
vacant location left for it. Thus, we observe that if $w_a$ and $w_b$
label nodes, $a, b \le i-1$, and $w_a > w_b$, then $w_a$ remains
closer to the root than $w_b$ in all $T_{Vi}$. Finally, replace
$b_{jk}$ at a "bottom" leaf by the vine added to $T_{BS(i-1)}$ at this
stage; this defines $T_{Vi}$.
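
One reading of the "bump up" step can be sketched in code. The
encoding is an assumption of this sketch and not the paper's: a vine is
a list of `(kind, weight)` pairs read from the root downward, with
kind `'w'` for a w-labeled leaf and `'b'` for a secondary node, the
last two leaves are treated as siblings, and the lowest `'w'` leaf is
taken to be the $w_{i-1}$ location. The weights are illustrative.

```python
def vine_cost(vine):
    """Expected cost of a vine given as (kind, weight) pairs from the root down.

    The last two leaves are siblings, so they share depth len(vine) - 1.
    """
    m = len(vine)
    return sum(weight * min(k + 1, m - 1) for k, (_, weight) in enumerate(vine))


def bump_up(vine, p):
    """Move the secondary node at index p down to the lowest 'w' slot.

    Only 'w' leaves below index p shift, each into the nearest vacated slot
    above it; 'b' leaves below p stay where they are.
    """
    slots = [p] + [q for q in range(p + 1, len(vine)) if vine[q][0] == 'w']
    result = list(vine)
    for upper, lower in zip(slots, slots[1:]):
        result[upper] = vine[lower]   # a w weight moves up one vacated slot
    result[slots[-1]] = vine[p]       # the b node lands in the last vacated slot
    return result


vine = [('w', 0.2), ('b', 0.3), ('w', 0.15), ('b', 0.25), ('w', 0.1)]
after = bump_up(vine, 1)
```

In this example the w weights keep their relative order from the root,
and the cost of the vine does not decrease, in line with the
observation made in the text.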


We now show that $K_{BSi} \le K_{Vi}$, assuming that
$K_{BS(i-1)} \le K_{V(i-1)}$ by induction hypothesis. Since the
expected cost is a sum of components, each component the product of a
weight and its distance from the root, any weight displaced an
equal amount in creating $T_{BSi}$ and $T_{Vi}$ will have equal effect
on $K_{BS(i-1)}$ and $K_{V(i-1)}$, thus preserving the inequality.
Thus the replacement of $b_{jk}$ by the same vine in the creation of
$T_{BSi}$ and $T_{Vi}$ preserves the inequality. We must only note
that moving $b_{jk}$ to the $w_{i-1}$ location and bumping up weights
preserves the inequality. But $w_{q+1} \le w_q$, all $q < i$; also
$b_{jk} \ge w_j$ (by the Fact), and $w_q$ below $b_{jk}$ in
$T_{V(i-1)}$ implies $q \ge j$. It follows that "bumping up" results
in no larger negative effect on the expected cost than if $w_j$ were
moved from the $w_{i-1}$ location up to the $b_{jk}$ location. Moving
$b_{jk}$ down to the $w_{i-1}$ location at least offsets this negative
effect, so $K_V$ is not decreased by this action. Thus
$K_{BSi} \le K_{Vi}$ holds.
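
The offsetting argument can be written out as a short calculation. The
notation below is introduced only for this sketch and is not in the
original: $d(\cdot)$ is the depth of a location in $T_{V(i-1)}$,
$w_{q_1}, \ldots, w_{q_m}$ (with $q_m = i-1$) are the bumped weights,
and $\delta_t \ge 0$ is the number of levels that $w_{q_t}$ moves up.
Since each weight moves into the slot vacated just above it, the
$\delta_t$ sum to the total depth difference $D$.

```latex
\begin{align*}
D &= d(w_{i-1}\text{ location}) - d(b_{jk}\text{ location}) \;\ge\; 0,
  \qquad \sum_{t=1}^{m} \delta_t = D, \\
\text{decrease from bumping up}
  &= \sum_{t=1}^{m} w_{q_t}\,\delta_t
  \;\le\; w_j \sum_{t=1}^{m} \delta_t \;=\; w_j D, \\
\text{increase from moving } b_{jk} \text{ down}
  &= b_{jk}\, D \;\ge\; w_j D, \\
\text{net change in } K_V
  &= b_{jk}\, D - \sum_{t=1}^{m} w_{q_t}\,\delta_t \;\ge\; 0 .
\end{align*}
```

The second line uses that every bumped weight satisfies
$w_{q_t} \le w_j$, and the third uses $b_{jk} \ge w_j$ from the Fact.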


We now consider the unrestricted case where the secondary
subtree on a w-path closest to node w may have leaf weight less
than w. If a $w_i$-path in $T_{BS}$ has such a secondary subtree we
shall call it the $w_i$-runt (or, simply, the runt), denote its
leaf weight by $a_i$, and exclude it in our notation
$b_{i1}, \ldots, b_{ir_i}$ for secondary nodes of the $w_i$-path. This
means $b_{i1}$ denotes the leaf weight of the closest secondary
subtree to $w_i$ such that $b_{i1} \ge w_i$. The runt is too small to
stand on its own; it will usually be associated with another node and
we proceed as before as much as possible.

The change is easy when constructing $T_{BSi}$ from $T_{BS(i-1)}$.
If the $w_i$-path has a runt, then attach it to the $w_i$ node, which
then has weight $w_i + a_i = w'_i$. If no runt exists we let
$a_i = 0$ for convenience. $T_{BSi}$ then has a $w'_i$-path with
$b_{i1}, \ldots, b_{ir_i}$ as before, but $b_{i1} \ge w'_i$ may not
hold. If the $w'_i$-path requires the expansion of a $w_j$-runt, some
$j < i$, in $T_{BS(i-1)}$, then replace the $w'_j$ node by an
(unlabeled) node with two sons, labeled $w_j$ and $a_j$ respectively,
and then expand the $a_j$ node as one would a $b_{jk}$ node.


We now consider the creation of $T_{Vi}$ from $T_{V(i-1)}$. There
are three cases which we enumerate. In case (ii) we shall see
that an $a_j$ may be shifted from $w'_j$ to $b_{j1}$, creating weights
$w_j$ and $b'_{j1} = b_{j1} + a_j$ respectively. Thus, in
$T_{V(i-1)}$ we must also consider $b'_{jk}$ of form $b_{jk} + a_j$.
Also weights of form $w_s + a_j$ can appear, as will be seen.

The following possibilities arise in $T_{V(i-1)}$. Recall
that the expansion of a node to a vine is determined by the
expansion that occurs to create $T_{BSi}$.

Case (i). The "node" $a_{i-1}$ needs to be expanded. Replace
node $w'_{i-1}$ by a node with sons $w_{i-1}$ and $a_{i-1}$ and then
expand $a_{i-1}$ in imitation of the expansion that creates $T_{BSi}$.

Case (ii). A node $b_{jk}$ needs to be expanded. Since $w_{i-1}$,
rather than $w'_{i-1}$, is to be "bumped up", move $a_{i-1}$ to
$b_{(i-1)1}$ to create $b'_{(i-1)1}$ as described earlier. (Note that
the distance of $a_{i-1}$ to the root is unchanged so the expected
cost is unaffected by this move.) Now move $b_{jk}$ to the $w_{i-1}$
node and bump up the w's below $b_{jk}$'s old location, as before. If
$b'_{jk}$ is $b_{jk} + a_j$ then do not move $a_j$. After "bumping
up", the old $b'_{jk}$ node will have weight $w_s + a_j$, for some s.

Case (iii). A "node" $a_j$, some $j < i-1$, needs to be
expanded. The $a_j$ is detached from its associate $b_{jk}$ or $w_s$,
the latter left in place, and $a_j$ is moved down the vine. The
$w_{i-1}$ node is replaced by a node with sons $w_{i-1}$ and $a_j$.
Then $a_j$ is expanded.


We now establish the relationship between the expected
costs $K_{BSk}$ and $K_{Vk}$. The inequality that we actually show
holds is

(*)    $K_{BSk} \le K_{Vk} + \sum_{h \in H_k} w_h$

for some $H_k \subseteq \{1, \ldots, n\}$, for all k,
$1 \le k \le n$. It is quickly seen from our earlier argument that (*)
holds for $k = 1$ with $H_1 = \emptyset$. We argue the induction step
from i-1 to i, following the cases just enumerated. We have by
induction hypothesis that (*) holds for $k = i-1$.

When Case (i) occurs for $T_{Vi}$, $T_{BSi}$ has been created by
the same expansion applied to $T_{BS(i-1)}$, so the incremental change
to $K_{BS(i-1)}$ and $K_{V(i-1)}$ is exactly the same. Thus (*) holds
for $k = i$ with $H_i = H_{i-1}$.

When Case (ii) occurs, the shift of $a_{i-1}$ to $b_{(i-1)1}$
causes no consequences to the expected value and the remainder of
the action is as for the restricted case considered earlier.
Thus (*) holds for $k = i$ with $H_i = H_{i-1}$.

When Case (iii) occurs, weight $a_j$ is shifted down the
vine, which increases the expected value $K_{Vi}$ without affecting
$K_{BSi}$. This is fine. However, in $T_{BS(i-1)}$, $a_j$ is part of
$w'_j$ and when split to two sons, $w_j$ and $a_j$, their path length
increases by one and so both weights are added (once) to
$K_{BS(i-1)}$. In modifying $T_{V(i-1)}$, $w_{i-1}$ is given an
increased path length of one and $a_j$ has a path length increase of
at least one. But $w_j \ge w_{i-1}$ in general, so the increase to
$K_{BS(i-1)}$ can exceed the increase to $K_{V(i-1)}$ by an amount
approaching $w_j$, if $w_{i-1}$ is very small. To preserve the
inequality we must add $w_j$ to the right hand side of (*). With this
accommodation we see that (*) holds for $k = i$ for this case with
$H_i = H_{i-1} \cup \{j\}$.


This concludes the cases we must consider, and (*) has
been shown to hold in particular for $k = n$. Now
$K_{BS} = K_{BSn}$ and $K_V = K_{Vn}$. It follows that
$K_{BS} \le K_V + 1$ since the sum of the weights is 1. ∎
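
The final step can be spelled out as a single chain. This is a sketch
of the reasoning as we read it, assuming the indices collected in
$H_n$ are distinct, so that the correction term is at most the total
leaf weight.

```latex
\begin{align*}
K_{BS} \;=\; K_{BSn}
  \;\le\; K_{Vn} + \sum_{h \in H_n} w_h
  \;\le\; K_{V} + \sum_{h=1}^{n} w_h
  \;\le\; K_{V} + 1 .
\end{align*}
```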


expected cost: K = 2(.3 + .1 + .3) + 3(.2 + .1)
                 = 2(.7) + 3(.3)
                 = 2.3


A binary identification problem with one testing procedure solution

Figure 1


expected cost: K = 2(.3 + .3 + .2) + 3(.1 + .1)
                 = 2(.8) + 3(.2)
                 = 2.2


An alternate testing procedure for the binary
identification problem of Figure 1.

Figure 2
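
The two expected costs quoted in the captions can be recomputed
directly. This is a minimal sketch: the (weight, depth) pairs below
are read off the arithmetic in the captions of Figures 1 and 2 (the
multipliers 2 and 3 are taken as leaf depths), since the trees
themselves are not reproduced here; the function name is ours.

```python
import math


def expected_cost(leaves):
    """Expected cost K: the sum of weight * depth over (weight, depth) leaf pairs."""
    return sum(weight * depth for weight, depth in leaves)


# Leaves grouped as in the captions: depth 2 first, then depth 3.
figure_1 = [(0.3, 2), (0.1, 2), (0.3, 2), (0.2, 3), (0.1, 3)]
figure_2 = [(0.3, 2), (0.3, 2), (0.2, 2), (0.1, 3), (0.1, 3)]

k1 = expected_cost(figure_1)
k2 = expected_cost(figure_2)

# Compare with a tolerance, since the weights are binary floating point values.
cheaper = k2 < k1 and not math.isclose(k1, k2)
```

The second procedure is cheaper because it places the .2 weight at
depth 2 and a .1 weight at depth 3, rather than the other way around.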


